-
Notifications
You must be signed in to change notification settings - Fork 4.4k
Initial architecture for AWS CLI v1-to-v2 upgrade linting tool #9803
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: upgrade-linting-tool
Are you sure you want to change the base?
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## upgrade-linting-tool #9803 +/- ##
=======================================================
Coverage ? 93.33%
=======================================================
Files ? 209
Lines ? 16807
Branches ? 0
=======================================================
Hits ? 15687
Misses ? 1120
Partials ? 0 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's looking good, I like the general experience. Left a few questions and suggestions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I understand the purpose of this Makefile. It looks like it can be largely replaced with modern Python dev tools like uv and ruff (also curious why we're using black and isort here).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fair, I'll look into uv and ruff, and potentially delete this Makefile and update the README dev guide
| For a full list of the breaking changes introduced with AWS CLI v2, see | ||
| [Breaking changes between AWS CLI version 1 and AWS CLI version 2](https://docs.aws.amazon.com/cli/latest/userguide/cliv2-migration-changes.html#cliv2-migration-changes-breaking). | ||
|
|
||
| ## Installation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought the idea was to publish this as a separate PyPI package? Why are we instructing users to build from source?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, so this was moreso intended for the devs contributing to this package. I agree most customers would be installing it from PyPi. I can try to make that more clear in the next revision with adding clarifications to this README, let me know if you have additional asks for this
| 2. Install dependencies: | ||
| ```bash | ||
| pip install -r requirements.txt | ||
| pip install -r requirements-dev-lock.txt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we anticipate users to require dev dependencies?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to my last comment, this workflow was intended for dev contributors (i.e. us). The dev dependencies are for PyTest and format packages to test and format the package.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we have setup.py in addition to pyproject.toml?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, after some research it seems we only really need one in our case, and that pyproject.toml is the favored way. I will delete setup.py in the next revision
| print("Invalid choice. Please enter y, n, u, or q.") | ||
|
|
||
|
|
||
| def display_finding(finding: LintFinding, index: int, total: int, script_content: str): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something that's confusing is that a line can show up as a diff even if nothing changed. Using the example file in this commit:
-aws secretsmanager put-secret-value --secret-id secret1213 \
- --secret-binary file://data.json
+aws secretsmanager put-secret-value --secret-id secret1213 \
+ --secret-binary file://data.json --cli-binary-format raw-in-base64-out
Is this intended behavior?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I tried to keep the diff-generation code simple while also giving a somewhat familiar experience to git diff, and I think this behavior is an artifact of that.
We can pursue more complex diffing logic to address this. If we go down this route, I'd likely utilize difflib which is included in the standard Python library. It has a unified_diff function which may produce what we're looking for.
Below is some simple code I tested locally, seems to show the desired behavior:
>>> src_lines = 'aws secretsmanager put-secret-value --secret-id secret1213 \\\n --secret-binary file://data.json\n'.splitlines(keepends=True)
>>> dest_lines = 'aws secretsmanager put-secret-value --secret-id secret1213 \\\n --secret-binary file://data.json --cli-binary-format raw-in-base64-out\n'.splitlines(keepends=True)
>>> diff2 = difflib.unified_diff(src_lines, dest_lines)
>>> sys.stdout.writelines(diff2)
---
+++
@@ -1,2 +1,2 @@
aws secretsmanager put-secret-value --secret-id secret1213 \
- --secret-binary file://data.json
+ --secret-binary file://data.json --cli-binary-format raw-in-base64-out
I'm leaning towards going this route, please share feedback on this. It would likely simplify some of our diff-generation-code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that looks great for our use case
| src_lines_removed = end_line - start_line + 1 | ||
| new_lines_added = len(dest_lines) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I'm understanding everything correctly, then src_lines_removed should always be equal to new_lines_added, right? If so, should we add validation on that to guard against potentially corrupting the source file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As of now, yes I expect them to be equal. By calculating it like this it does not lock us in to that as a requirement.
I can imagine down the line us suggesting a 'fix' that takes up multiple lines, in that case they would not be equal. However, we currently have no plans for any of the linting rules to suggest fixes that take up multiple lines. It'll pretty much always be adding a flag to the last line or modifying the command or argument.
I can add this assertion, and there's always the option to remove it if we ever have suggestions for adding multi-line suggestions. Are you recommending it's added as a runtime check? Or introducing a test for this?
I'm leaning towards a runtime check, and raising RuntimeError if they're not equal, with comments explaining some things in this thread. That'll be my plan for next revision unless you share disagreements
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think we should just stick to line-by-line changes (ie always 1 line changed per 1 line affected). Go ahead and add the runtime check
| elif choice == "q": | ||
| print("Cancelled.") | ||
| sys.exit(0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be a bit confusing if a user accepts a few fixes and then presses q to exit early with the expectation that the accepted fixes would be applied.
- We should explicitly document this behavior.
- [non-blocking suggestion] And also consider adding a [s]ave option that "flushes" accepted fixes and applies them. If you think we should do this, okay with doing it in a followup PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes sense. Now that I think about this, I think we can let q be the save and exit option that you're suggesting. Users can always use Ctrl+C to terminate the program immediately.
Let me know if you agree with this. In either case, I would like to do it in a followup PR though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm let's discuss this with the team. I can see advantages to different approaches here
| def display_finding(finding: LintFinding, index: int, total: int, script_content: str): | ||
| """Display a finding to the user with context.""" | ||
| src_lines = script_content.splitlines() | ||
| dest_lines = finding.edit.inserted_text.splitlines() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To calculate new_lines_added, is there a reason we're not using LintFinding.line_start and LintFinding.line_end?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this relates to a previous reply in another comment. I left open the possibility that there could be more "new lines" than "source lines". line_start and line_end are the line numbers in the source file.
If we end up deciding to be more rigid and force that the number of "new lines" equals the number of "source lines" with an assertion, we may be able use the variables you suggested instead.
With this information, please advise on if you want me to proceed with the more rigid approach, with the assertion that source == new lines.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Echoing my other reply: Yeah let's go ahead and make the assertion
Description of changes:
Description of tests:
Example output:
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.